Quarterly and Annual Ridership Totals by Mode of Transportation
The first data set comes from the American Public Transportation Association (APTA) and serves as an introductory overview of public transit ridership over time. It gives a broad view of quarterly ridership across the entire country from 1990 onward, which sets the stage for the problem we intend to explore.
The raw data and methodology for how it was obtained can be found using this link: https://www.apta.com/research-technical-resources/transit-statistics/ridership-report/
The data itself can be downloaded using this link: https://www.apta.com/wp-content/uploads/APTA-Ridership-by-Mode-and-Quarter-1990-Present.xlsx
To download this data, I used an R API tool, which saves the data in Excel format. Below is the code for this action and a screenshot of the raw data to illustrate its form upon download:
Quarterly and Annual Ridership Totals by Mode of Transportation
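The download described above was done in R. As a rough Python equivalent (a sketch only, not the original code), the file could be fetched with the standard library. The helper names `local_name_for` and `download_file` and the `data` output folder are assumptions made for illustration.

```python
import shutil
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse
from urllib.request import urlopen

# Download link given in the text above.
APTA_URL = (
    "https://www.apta.com/wp-content/uploads/"
    "APTA-Ridership-by-Mode-and-Quarter-1990-Present.xlsx"
)


def local_name_for(url: str) -> str:
    """Return the file name at the end of the URL path."""
    return PurePosixPath(urlparse(url).path).name


def download_file(url: str, dest_dir: str = "data") -> Path:
    """Stream the file at `url` into `dest_dir` (hypothetical folder) and return its path."""
    dest = Path(dest_dir) / local_name_for(url)
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest


# Example call (requires network access):
# path = download_file(APTA_URL)
```

The saved workbook keeps its original `.xlsx` name, so it can be opened directly in Excel or read by any spreadsheet library.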
News API Data
An essential part of understanding public perception of a topic is assessing how it is covered in the news. News coverage often informs general opinions and can introduce conversations that had not previously been in the zeitgeist. Thus, this paper will analyze text data from https://newsapi.org/ to study news coverage of two distinct public transit systems.
For this project, I will be looking at data regarding the Washington Metropolitan Area Transit Authority (WMATA) and Bay Area Rapid Transit (BART). These transit systems have several advantages for academic study: both are large networks with rich histories and connections to their respective cities, robust data sources exist that allow us to analyze them from several angles, and comparing them offers perspective on differences between cities on opposite coasts.
The following shows how I accessed this News API via Python code. This outputs a JSON file as raw data, the start of which is included below each code block to show the nature of the data prior to cleaning.
Raw JSON data from News API; Topic: Bay Area Rapid Transit
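A request of this kind can be sketched as follows. This is a minimal illustration rather than the exact code used for the project: it assumes the News API `everything` endpoint, a personal API key, and an English-language search on the quoted topic. The function names and the output file name are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

NEWS_API_ENDPOINT = "https://newsapi.org/v2/everything"


def build_request_url(topic: str, api_key: str, page_size: int = 100) -> str:
    """Build the query URL for an exact-phrase search on `topic`."""
    params = {
        "q": f'"{topic}"',      # quotes request an exact phrase match
        "language": "en",       # assumption: English-language coverage only
        "pageSize": page_size,
        "apiKey": api_key,
    }
    return f"{NEWS_API_ENDPOINT}?{urlencode(params)}"


def fetch_and_save(topic: str, api_key: str, out_path: str) -> dict:
    """Fetch the articles for `topic` and write the raw JSON response to disk."""
    with urlopen(build_request_url(topic, api_key)) as response:
        payload = json.load(response)
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(payload, f, indent=2)
    return payload


# Example call (requires network access and a valid key):
# fetch_and_save("Bay Area Rapid Transit", "YOUR_API_KEY", "bart_news_raw.json")
```

Running the same call with a WMATA search phrase would produce a second raw JSON file with the same structure, which keeps the later cleaning steps identical for both systems.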
Remote Work Trends
It is reasonable to hypothesize that one of the main factors in public transit usage is commuting to and from work. The term “rush hour” is in everyday use and refers to the times in the morning and evening at which most people travel to or from work. Thus, when COVID-19 struck and many workers were no longer expected to work in person, the need for public transportation decreased drastically.
In the years since, remote work has been a topic of controversy. Many workers enjoy the benefits of privacy and the added time of not having to commute, while employers often cite advantages of being on-site even for office jobs. While in-person work has rebounded recently, much like public transit usage, it has not come close to returning to its pre-pandemic prevalence. Therefore, understanding trends surrounding remote work can provide insights on how to analyze public transportation trends.
WFH Research maintains extensive data sets on remote work. For the purposes of this project, we will use three of them. To better understand the controversial aspects of remote work, the first two data sets contain survey information from (a) employers and (b) workers on how many remote work days per week they desire on average. The third data set provides time series information on the amount of working from home (percent of full paid days) for large cities, including Washington, D.C. and the San Francisco Bay Area. Screenshots of the raw data are shown below:
Remote Work Desires of Employers
Remote Work Desires of Workers
Remote Work Percentages by City
Ridership Trends for each City
Now that we have background information on both cities of interest, the next piece of information to gather is public transit ridership. This gives us comprehensive monthly data from 2018 to 2023, which provides insight into the nature of the decline in public transit ridership as well as the current recovery.
For WMATA, the data comes from https://www.wmata.com/initiatives/ridership-portal/, and gives simple average daily entries per month. Meanwhile, BART data comes from https://www.bart.gov/about/reports/ridership, which provides reports each month on entries and exits by station. The raw data for WMATA entries, as well as the most recent BART report, are shown below:
WMATA Average Daily Entries by Month
September 2023 Report for BART Entries/Exits by Station
Ridership by Hour
In addition to the volume of public transit usage, we can glean information on the purpose of that usage by analyzing ridership by hour of the day. High peaks during “rush hour” likely indicate that work commuting has a strong influence on the data. Because of this, I downloaded both pre-pandemic and post-pandemic data sets on WMATA ridership by hour to examine this relationship and whether it has changed under new circumstances. March 17, 2020 is chosen as the demarcation date, as that was the day on which the first social distancing precautions were announced in Washington, D.C. The data is shown below:
Hourly Ridership from 1/1/2018 to 3/17/2020
Hourly Ridership from 3/18/2020 to 10/5/2023
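The pre- and post-pandemic comparison described above can be sketched in a few lines. This is an illustrative outline, not the analysis code for the project: it assumes the hourly records have already been parsed into `(date, hour, entries)` tuples, and the function name `hourly_share` is hypothetical. Each period's entries are converted to a share of that period's total, so the shape of the daily curve can be compared even though overall volume differs.

```python
from collections import defaultdict
from datetime import date

# Last day counted as pre-pandemic, matching the date ranges of the two files above.
CUTOFF = date(2020, 3, 17)


def hourly_share(records):
    """Return {"pre": {hour: share}, "post": {hour: share}}.

    `records` is an iterable of (date, hour, entries) tuples (assumed layout).
    Each share is that hour's entries divided by the period's total entries.
    """
    totals = {"pre": defaultdict(float), "post": defaultdict(float)}
    for day, hour, entries in records:
        period = "pre" if day <= CUTOFF else "post"
        totals[period][hour] += entries

    shares = {}
    for period, by_hour in totals.items():
        grand_total = sum(by_hour.values())
        if grand_total:
            shares[period] = {h: v / grand_total for h, v in sorted(by_hour.items())}
        else:
            shares[period] = {}
    return shares
```

If the rush-hour shares are noticeably lower in the "post" period than in the "pre" period, that would be consistent with commuting making up a smaller part of ridership than before.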
Ridership by Demographic
In answering the question of whether public service should be the paramount consideration in judging public transit’s efficacy, it is important to understand that transit often disproportionately serves underprivileged groups. By analyzing demographic data, we can gather insights on who benefits most from robust public transit systems. To address this, we use data from the U.S. Census Bureau that provides 2021 5-year estimates of means of transportation to work by selected characteristics. The raw data is shown below:
2021 5-Year Estimate of Transportation Means by Demographic